22 research outputs found
Parallel three-dimensional simulations of quasi-static elastoplastic solids
Hypo-elastoplasticity is a flexible framework for modeling the mechanics of
many hard materials under small elastic deformation and large plastic
deformation. Under typical loading rates, most laboratory tests of these
materials happen in the quasi-static limit, but there are few existing
numerical methods tailor-made for this physical regime. In this work, we extend
to three dimensions a recent projection method for simulating quasi-static
hypo-elastoplastic materials. The method is based on a mathematical
correspondence to the incompressible Navier-Stokes equations, where the
projection method of Chorin (1968) is an established numerical technique. We
develop a three-dimensional parallel geometric multigrid solver to solve the
linear system arising in the quasi-static projection. Our method
is tested through simulation of three-dimensional shear band nucleation and
growth, a precursor to failure in many materials. As an example system, we
employ a physical model of a bulk metallic glass based on the shear
transformation zone theory, but the method can be applied to any
elastoplasticity model. We consider several examples of three-dimensional shear
banding, and examine shear band formation in physically realistic materials
with heterogeneous initial conditions under both simple shear deformation and
boundary conditions inspired by friction welding.
Comment: Final version. Accepted for publication in Computer Physics
Communications.
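The method rests on the correspondence to Chorin's (1968) projection for incompressible flow. As a rough illustration of that idea only, the sketch below projects a periodic 2D velocity field onto its divergence-free part with a spectral Poisson solve; the function name and the FFT-based solve are illustrative stand-ins, not the paper's parallel multigrid solver.

```python
import numpy as np

def project(u, v, dx):
    """Project a periodic 2D velocity field (u, v) onto its divergence-free
    part, in the spirit of Chorin's (1968) projection: solve the Poisson
    problem lap(phi) = div(u, v), then subtract grad(phi).  An illustrative
    spectral stand-in for the paper's parallel geometric multigrid solver."""
    n = u.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    div_h = 1j * kx * uh + 1j * ky * vh       # divergence in Fourier space
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                            # guard the zero (mean) mode
    phi_h = -div_h / k2                       # Poisson solve: -k^2 phi_h = div_h
    phi_h[0, 0] = 0.0
    u_new = np.real(np.fft.ifft2(uh - 1j * kx * phi_h))
    v_new = np.real(np.fft.ifft2(vh - 1j * ky * phi_h))
    return u_new, v_new
```

On a periodic grid the projected field has zero discrete divergence to machine precision; the paper replaces this spectral solve with a parallel geometric multigrid solve of the analogous linear system.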
A continuous-time analysis of distributed stochastic gradient
We analyze the effect of synchronization on distributed stochastic gradient
algorithms. By exploiting an analogy with dynamical models of biological quorum
sensing -- where synchronization between agents is induced through
communication with a common signal -- we quantify how synchronization can
significantly reduce the magnitude of the noise felt by the individual
distributed agents and by their spatial mean. This noise reduction is in turn
associated with a reduction in the smoothing of the loss function imposed by
the stochastic gradient approximation. Through simulations on model non-convex
objectives, we demonstrate that coupling can stabilize higher noise levels and
improve convergence. We provide a convergence analysis for strongly convex
functions by deriving a bound on the expected deviation of the spatial mean of
the agents from the global minimizer for an algorithm based on quorum sensing,
the same algorithm with momentum, and the Elastic Averaging SGD (EASGD)
algorithm. We discuss extensions to new algorithms which allow each agent to
broadcast its current measure of success and shape the collective computation
accordingly. We supplement our theoretical analysis with numerical experiments
on convolutional neural networks trained on the CIFAR-10 dataset, where we note
a surprising regularizing property of EASGD even when applied to the
non-distributed case. This observation suggests alternative second-order
in-time algorithms for non-distributed optimization that are competitive with
momentum methods.
Comment: 9/14/19: Final version, accepted for publication in Neural
Computation. 4/7/19: Significant edits: addition of simulations, deep network
results, and revisions throughout. 12/28/18: Initial submission.
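The coupling mechanism can be caricatured in a few lines. The sketch below is a hypothetical simplification, not the paper's exact quorum-sensing or EASGD update: p agents take noisy gradient steps on a strongly convex loss while being attracted to their spatial mean with gain k, so the mean feels noise reduced roughly by a factor of sqrt(p).

```python
import numpy as np

def coupled_sgd(grad, p=8, k=1.0, lr=0.05, noise=0.5, steps=2000, seed=0):
    """Toy distributed SGD with quorum-style coupling: p agents take noisy
    gradient steps and are attracted to their spatial mean with gain k.
    A hypothetical simplification for illustration, not the paper's exact
    quorum-sensing or EASGD algorithm."""
    rng = np.random.default_rng(seed)
    x = 5.0 * rng.normal(size=p)                   # random initial agent positions
    for _ in range(steps):
        g = grad(x) + noise * rng.normal(size=p)   # per-agent stochastic gradient
        xbar = x.mean()                            # common signal: the spatial mean
        x = x - lr * g + lr * k * (xbar - x)       # gradient step + attraction to mean
    return x

# Strongly convex test objective f(x) = x^2 / 2, global minimizer at 0.
final = coupled_sgd(lambda x: x)
```

Despite the substantial per-agent noise, both the individual agents and their mean settle close to the minimizer, consistent with the noise reduction the analysis quantifies.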
Deep learning probability flows and entropy production rates in active matter
Active matter systems, from self-propelled colloids to motile bacteria, are
characterized by the conversion of free energy into useful work at the
microscopic scale. These systems generically involve physics beyond the reach
of equilibrium statistical mechanics, and a persistent challenge has been to
understand the nature of their nonequilibrium states. The entropy production
rate and the magnitude of the steady-state probability current provide
quantitative ways to do so by measuring the breakdown of time-reversal symmetry
and the strength of nonequilibrium transport of measure. Yet, their efficient
computation has remained elusive, as they depend on the system's unknown and
high-dimensional probability density. Here, building upon recent advances in
generative modeling, we develop a deep learning framework that estimates the
score of this density. We show that the score, together with the microscopic
equations of motion, gives direct access to the entropy production rate, the
probability current, and their decomposition into local contributions from
individual particles, spatial regions, and degrees of freedom. To represent the
score, we introduce a novel, spatially-local transformer-based network
architecture that learns high-order interactions between particles while
respecting their underlying permutation symmetry. We demonstrate the broad
utility and scalability of the method by applying it to several
high-dimensional systems of interacting active particles undergoing
motility-induced phase separation (MIPS). We show that a single instance of our
network trained on a system of 4096 particles at one packing fraction can
generalize to other regions of the phase diagram, including systems with as
many as 32768 particles. We use this observation to quantify the spatial
structure of the departure from equilibrium in MIPS as a function of the number
of particles and the packing fraction.
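Once the score is available, the estimator has a compact form. In the sketch below an analytically known score stands in for the paper's learned transformer network; the rotational Ornstein-Uhlenbeck model, its parameters, and the function names are illustrative assumptions, while the identity for the entropy production rate of overdamped dynamics is standard.

```python
import numpy as np

def entropy_production_rate(samples, drift, score, D):
    """Monte Carlo estimate of the steady-state entropy production rate
        EPR = E[ |v(x)|^2 ] / D,   v(x) = b(x) - D * score(x),
    for overdamped dynamics dx = b(x) dt + sqrt(2 D) dW, where v is the
    probability-current velocity and score(x) = grad log p(x).  Here an
    exact score stands in for the paper's learned network."""
    v = drift(samples) - D * score(samples)   # current velocity at each sample
    return np.mean(np.sum(v * v, axis=1)) / D

# Illustrative nonequilibrium model (an assumption, not from the paper):
# a 2D OU process with rotational drift, dx = (-x + omega * J x) dt + sqrt(2 D) dW.
# Its stationary density is N(0, D I), so score(x) = -x / D and the exact
# entropy production rate is 2 * omega**2.
D, omega = 0.5, 1.5
J = np.array([[0.0, -1.0], [1.0, 0.0]])
rng = np.random.default_rng(1)
x = rng.normal(scale=np.sqrt(D), size=(200_000, 2))   # stationary samples
epr = entropy_production_rate(
    x,
    drift=lambda s: -s + omega * s @ J.T,
    score=lambda s: -s / D,
    D=D,
)
```

For this model the estimate converges to the exact value 2 * omega**2; the per-sample summand also gives the decomposition into local contributions mentioned above.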
Implicit regularization and momentum algorithms in nonlinear adaptive control and prediction
Stable concurrent learning and control of dynamical systems is the subject of
adaptive control. Despite being an established field with many practical
applications and a rich theory, much of the development in adaptive control for
nonlinear systems revolves around a few key algorithms. By exploiting strong
connections between classical adaptive nonlinear control techniques and recent
progress in optimization and machine learning, we show that there exists
considerable untapped potential in algorithm development for both adaptive
nonlinear control and adaptive dynamics prediction. We first introduce
first-order adaptation laws inspired by natural gradient descent and mirror
descent. We prove that when there are multiple dynamics consistent with the
data, these non-Euclidean adaptation laws implicitly regularize the learned
model. The local geometry imposed during learning may thus be used to select
parameter vectors with desired properties, such as sparsity, out of the many
that achieve perfect tracking or prediction. We apply this result to
regularized dynamics predictor and observer design, and as concrete examples
consider Hamiltonian systems, Lagrangian systems, and recurrent neural
networks. We subsequently develop a variational formalism based on the Bregman
Lagrangian to define adaptation laws with momentum applicable to linearly
parameterized systems and to nonlinearly parameterized systems satisfying
monotonicity or convexity requirements. We show that the Euler-Lagrange
equations for the Bregman Lagrangian lead to natural gradient and mirror
descent-like adaptation laws with momentum, and we recover their first-order
analogues in the infinite friction limit. We illustrate our analyses with
simulations demonstrating our theoretical results.
Comment: v6: cosmetic adjustments to figures 4, 5, and 6. v5: final version,
accepted for publication in Neural Computation. v4: significant updates,
revamped section on dynamics prediction and exploiting structure. v3: new
general theorems and extensions to dynamic prediction. 37 pages, 3 figures.
v2: significant updates; submission ready.
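The simplest instance of the implicit-regularization result can be checked directly: plain gradient descent from zero on an underdetermined quadratic loss selects the minimum-Euclidean-norm interpolant. The sketch below is that Euclidean special case only, not the paper's natural-gradient or mirror-descent adaptation laws; the function name and step-size rule are illustrative choices.

```python
import numpy as np

def minimum_norm_fit(A, b, steps=20000):
    """Gradient descent from theta = 0 on the underdetermined least-squares
    loss |A theta - b|^2 / 2.  Among the many parameter vectors that fit the
    data exactly, the Euclidean geometry implicitly selects the minimum-norm
    one: the simplest instance of the implicit regularization the paper
    extends to non-Euclidean (natural gradient / mirror descent) laws."""
    lr = 1.0 / np.linalg.norm(A, 2) ** 2   # stable step from the top singular value
    theta = np.zeros(A.shape[1])
    for _ in range(steps):
        theta -= lr * A.T @ (A @ theta - b)
    return theta

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 10))    # 3 constraints, 10 parameters: underdetermined
b = rng.normal(size=3)
theta = minimum_norm_fit(A, b)
```

The iterate stays in the row space of A, so it converges to the pseudoinverse solution; swapping in a mirror-descent geometry changes which interpolant is selected, e.g. toward sparsity.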
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions
A class of generative models that unifies flow-based and diffusion-based
methods is introduced. These models extend the framework proposed in Albergo &
Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time
stochastic processes called 'stochastic interpolants' to bridge any two
arbitrary probability density functions exactly in finite time. These
interpolants are built by combining data from the two prescribed densities with
an additional latent variable that shapes the bridge in a flexible way. The
time-dependent probability density function of the stochastic interpolant is
shown to satisfy a first-order transport equation as well as a family of
forward and backward Fokker-Planck equations with tunable diffusion. Upon
consideration of the time evolution of an individual sample, this viewpoint
immediately leads to both deterministic and stochastic generative models based
on probability flow equations or stochastic differential equations with an
adjustable level of noise. The drift coefficients entering these models are
time-dependent velocity fields characterized as the unique minimizers of simple
quadratic objective functions, one of which is a new objective for the score of
the interpolant density. Remarkably, we show that minimization of these
quadratic objectives leads to control of the likelihood for any of our
generative models built upon stochastic dynamics. By contrast, we establish
that generative models based upon a deterministic dynamics must, in addition,
control the Fisher divergence between the target and the model. We also
construct estimators for the likelihood and the cross-entropy of
interpolant-based generative models, discuss connections with other stochastic
bridges, and demonstrate that such models recover the Schrödinger bridge
between the two target densities when explicitly optimizing over the
interpolant.
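A minimal one-dimensional instance of the construction is easy to write down. In the sketch below, the two Gaussian endpoint densities, the function name, and the choice gamma(t) = sqrt(t(1-t)) for the latent term are illustrative assumptions; gamma is one admissible bridge shape among many.

```python
import numpy as np

def stochastic_interpolant(x0, x1, z, t):
    """One-dimensional stochastic interpolant
        x_t = (1 - t) x0 + t x1 + gamma(t) z,   gamma(t) = sqrt(t (1 - t)),
    bridging the density of x0 at t = 0 to that of x1 at t = 1 exactly,
    with the latent z shaping the bridge in between.  gamma is one
    admissible choice, not the only one."""
    gamma = np.sqrt(t * (1.0 - t))
    return (1.0 - t) * x0 + t * x1 + gamma * z

rng = np.random.default_rng(0)
n = 200_000
x0 = rng.normal(0.0, 1.0, n)   # base density N(0, 1) (illustrative choice)
x1 = rng.normal(5.0, 1.0, n)   # target density N(5, 1) (illustrative choice)
z = rng.normal(0.0, 1.0, n)    # latent variable, independent of the endpoints
xt = stochastic_interpolant(x0, x1, z, 0.5)   # midpoint marginal: N(2.5, 0.75)
```

The endpoints are matched exactly in finite time by construction; the generative models in the paper then learn the velocity (and score) of this time-dependent density via quadratic objectives.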
Manifold learning for coarse-graining atomistic simulations: Application to amorphous solids
We introduce a generalized machine learning framework to probabilistically
parameterize upper-scale models in the form of nonlinear PDEs consistent with a
continuum theory, based on coarse-grained atomistic simulation data of
mechanical deformation and flow processes. The proposed framework utilizes a
hypothesized coarse-graining methodology with manifold learning and
surrogate-based optimization techniques. Coarse-grained high-dimensional data
describing quantities of interest of the multiscale models are projected onto a
nonlinear manifold whose geometric and topological structure is exploited for
measuring behavioral discrepancies in the form of manifold distances. A
surrogate model is constructed using Gaussian process regression to identify a
mapping between stochastic parameters and distances. Derivative-free
optimization is employed to adaptively identify a unique set of parameters of
the upper-scale model capable of rapidly reproducing the system's behavior
while maintaining consistency with coarse-grained atomic-level simulations. The
proposed method is applied to learn the parameters of the shear transformation
zone (STZ) theory of plasticity that describes plastic deformation in amorphous
solids as well as coarse-graining parameters needed to translate between
atomistic and continuum representations. We show that the methodology is able
to successfully link coarse-grained microscale simulations to macroscale
observables and achieve a high-level of parity between the models across
scales.Comment: 34 pages, 12 figures, references added, Section 4 added, Section 2.1
update
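The surrogate loop can be caricatured in one dimension. The sketch below is a hypothetical toy, not the authors' implementation: a numpy-only Gaussian process regression serves as the surrogate for a black-box "manifold distance", and a derivative-free loop greedily evaluates the parameter where the surrogate mean is lowest.

```python
import numpy as np

def gp_fit_predict(X, y, Xs, length=1.0, noise=1e-6):
    """Gaussian process regression with an RBF kernel (numpy only).
    Returns the posterior mean of the surrogate at the test points Xs."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)
    K = k(X, X) + noise * np.eye(len(X))   # jitter keeps K invertible
    return k(Xs, X) @ np.linalg.solve(K, y)

def surrogate_minimize(distance, lo, hi, n_init=7, n_iter=10):
    """Derivative-free surrogate optimization: treat the manifold distance
    as a black box, fit a GP surrogate to the evaluations so far, and
    greedily evaluate where the surrogate mean is lowest.  A hypothetical
    toy loop, not the authors' implementation."""
    X = np.linspace(lo, hi, n_init)        # initial design points
    y = np.array([distance(x) for x in X])
    grid = np.linspace(lo, hi, 401)        # candidate parameter values
    for _ in range(n_iter):
        mu = gp_fit_predict(X, y, grid)
        x_next = grid[np.argmin(mu)]       # exploit the surrogate minimum
        X = np.append(X, x_next)
        y = np.append(y, distance(x_next))
    return X[np.argmin(y)]                 # best evaluated parameter

# Stand-in "manifold distance" between simulated and reference behavior,
# minimized at a made-up true parameter value of 2.0.
best = surrogate_minimize(lambda th: (th - 2.0) ** 2, 0.0, 5.0)
```

Each black-box evaluation here stands in for a full coarse-grained simulation plus a manifold-distance computation, which is why the surrogate-based loop, rather than direct optimization, is attractive in the paper's setting.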